Single Shot Detector for Detecting Clickable Object in Mobile Device Screen

Min-Seok Jo (조민석); Hye-won Chun (전혜원); Seong-Soo Han (한성수); Chang-Sung Jeong (정창성)

KIPS Transactions on Software and Data Engineering (정보처리학회 논문지 소프트웨어 및 데이터 공학)

Korean Title	모바일 디바이스 화면의 클릭 가능한 객체 탐지를 위한 싱글 샷 디텍터
English Title	Single Shot Detector for Detecting Clickable Object in Mobile Device Screen
Author	Min-Seok Jo (조민석), Hye-won Chun (전혜원), Seong-Soo Han (한성수), Chang-Sung Jeong (정창성)
Citation	Vol. 11, No. 1, pp. 29-34 (Jan. 2022)
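Korean Abstract (English translation)	We build a dataset for recognizing clickable objects on mobile device screens and propose a new network architecture. Using clickable objects as the criterion, data were collected from multiple applications on devices with various resolutions. A total of 24,937 annotations were subdivided into seven categories: text, edit text, image, button, region, status bar, and navigation bar. To train on this dataset, the model uses the Deconvolution Single Shot Detector as its baseline. The backbone network is a Squeeze-and-Excitation network, obtained by adding Squeeze-and-Excitation blocks to the existing ResNet, and the Single Shot Detector layers and Deconvolution modules are stacked in the form of a Feature Pyramid Network and connected to the head. In addition, to minimize the feature loss caused by the existing 1:1 input resolution ratio, the ratio was changed to 1:2, similar to that of a mobile device screen. In experiments on the constructed dataset, the model improved mean average precision by up to 101% compared with the baseline.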
English Abstract	We propose a novel network architecture and build a dataset for recognizing clickable objects on mobile device screens. The data were collected, based on clickable objects, from mobile device screens with various resolutions, and a total of 24,937 annotations were subdivided into seven categories: text, edit text, image, button, region, status bar, and navigation bar. We use the Deconvolution Single Shot Detector as the baseline, a backbone network with Squeeze-and-Excitation blocks, the Single Shot Detector layer structure to derive inference results, and a Feature Pyramid Network structure. We also extract features efficiently by changing the network's input resolution from the existing 1:1 ratio to a 1:2 ratio similar to that of a mobile device screen. In experiments on the dataset we built, the mean average precision improved by up to 101% compared to the baseline.
Keywords	Test Automation; Android; Object Detection; Mobile Screen Recognition; Computer Vision; Deep Learning
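
The abstract attributes part of the gain to replacing the 1:1 input ratio with a 1:2 ratio closer to a phone screen. The following is a minimal sketch of why that could matter, not the authors' preprocessing: it assumes an aspect-preserving resize with padding (the record does not say whether the paper pads or stretches), reads 1:2 as width:height for a portrait screen, and uses hypothetical sizes (a 1080x2160 screenshot, 512x512 and 512x1024 inputs). The function name `letterbox_coverage` is likewise illustrative.

```python
def letterbox_coverage(screen_w, screen_h, input_w, input_h):
    """Fraction of the network input occupied by the screenshot after an
    aspect-preserving resize; the remainder would be padding."""
    # Largest uniform scale at which the screenshot still fits in the input.
    scale = min(input_w / screen_w, input_h / screen_h)
    used_w = screen_w * scale
    used_h = screen_h * scale
    return (used_w * used_h) / (input_w * input_h)


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
square = letterbox_coverage(1080, 2160, 512, 512)   # 1:1 input
tall = letterbox_coverage(1080, 2160, 512, 1024)    # 1:2 input (width:height)

print(f"1:1 input coverage: {square:.2f}")
print(f"1:2 input coverage: {tall:.2f}")
```

Under these assumptions the square input spends half of its pixels on padding, while the 1:2 input is filled entirely by the screenshot. If the pipeline stretched the image instead of padding it, the cost of a 1:1 input would show up as horizontal distortion of UI elements rather than as unused area.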